    PathoScope 2.0: a complete computational framework for strain identification in environmental or clinical sequencing samples.

    BACKGROUND: Recent innovations in sequencing technologies have provided researchers with the ability to rapidly characterize the microbial content of an environmental or clinical sample with unprecedented resolution. These approaches are producing a wealth of information that is providing novel insights into the microbial ecology of the environment and human health. However, these sequencing-based approaches produce large and complex datasets that require efficient and sensitive computational analysis workflows. Many recent tools for analyzing metagenomic sequencing data have emerged; however, these approaches often suffer from issues of specificity and efficiency, and typically do not include a complete metagenomic analysis framework. RESULTS: We present PathoScope 2.0, a complete bioinformatics framework for rapidly and accurately quantifying the proportions of reads from individual microbial strains present in metagenomic sequencing data from environmental or clinical samples. The pipeline performs all necessary computational analysis steps, including reference genome library extraction and indexing, read quality control and alignment, strain identification, and summarization and annotation of results. We rigorously evaluated PathoScope 2.0 using simulated data and data from the 2011 outbreak of Shiga-toxigenic Escherichia coli O104:H4. CONCLUSIONS: The results show that PathoScope 2.0 is a complete, highly sensitive, and efficient approach for metagenomic analysis that outperforms alternative approaches in scope, speed, and accuracy. The PathoScope 2.0 pipeline software is freely available for download at: http://sourceforge.net/projects/pathoscope/

    A comprehensive assessment of demographic, environmental, and host genetic associations with gut microbiome diversity in healthy individuals.

    BACKGROUND: The gut microbiome is an important determinant of human health. Its composition has been shown to be influenced by multiple environmental factors and likely by host genetic variation. In the framework of the Milieu Intérieur Consortium, a total of 1000 healthy individuals of western European ancestry, with a 1:1 sex ratio and evenly stratified across five decades of life (age 20-69), were recruited. We generated 16S ribosomal RNA profiles from stool samples for 858 participants. We investigated genetic and non-genetic factors that contribute to individual differences in fecal microbiome composition. RESULTS: Among 110 demographic, clinical, and environmental factors, 11 were identified as significantly correlated with α-diversity, β-diversity, or abundance of specific microbial communities in multivariable models. Age and blood alanine aminotransferase levels showed the strongest associations with microbiome diversity. In total, all non-genetic factors explained 16.4% of the variance. We then searched for associations between > 5 million single nucleotide polymorphisms and the same indicators of fecal microbiome diversity, including the significant non-genetic factors as covariates. No genome-wide significant associations were identified after correction for multiple testing. A small fraction of previously reported associations between human genetic variants and specific taxa could be replicated in our cohort, while no replication was observed for any of the diversity metrics. CONCLUSION: In a well-characterized cohort of healthy individuals, we identified several non-genetic variables associated with fecal microbiome diversity. In contrast, host genetics only had a negligible influence. Demographic and environmental factors are thus the main contributors to fecal microbiome composition in healthy individuals. TRIAL REGISTRATION: ClinicalTrials.gov identifier NCT01699893

    Elucidating microbial codes to distinguish individuals

    Temporal Stability of the Human Skin Microbiome.

    Biogeography and individuality shape the structural and functional composition of the human skin microbiome. To explore these factors' contribution to skin microbial community stability, we generated metagenomic sequence data from longitudinal samples collected over months and years. Analyzing these samples using a multi-kingdom, reference-based approach, we found that despite the skin's exposure to the external environment, its bacterial, fungal, and viral communities were largely stable over time. Site, individuality, and phylogeny were all determinants of stability. Foot sites exhibited the most variability; individuals differed in stability; and transience was a particular characteristic of eukaryotic viruses, which showed little site-specificity in colonization. Strain and single-nucleotide variant-level analysis showed that individuals maintain, rather than reacquire, prevalent microbes from the environment. Longitudinal stability of skin microbial communities generates hypotheses about colonization resistance and empowers clinical studies exploring alterations observed in disease states. Cell 2016 May 5; 165(4):854-66

    Clinical PathoScope: Rapid alignment and filtration for accurate pathogen identification in clinical samples using unassembled sequencing data

    BACKGROUND: The use of sequencing technologies to investigate the microbiome of a sample can positively impact patient healthcare by providing therapeutic targets for personalized disease treatment. However, these samples contain genomic sequences from various sources that complicate the identification of pathogens. RESULTS: Here we present Clinical PathoScope, a pipeline to rapidly and accurately remove host contamination, isolate microbial reads, and identify potential disease-causing pathogens. We have accomplished three essential tasks in the development of Clinical PathoScope. First, we developed an optimized framework for pathogen identification using a computational subtraction methodology in concordance with read trimming and ambiguous read reassignment. Second, we have demonstrated the ability of our approach to identify multiple pathogens in a single clinical sample, accurately identify pathogens at the subspecies level, and determine the nearest phylogenetic neighbor of novel or highly mutated pathogens using real clinical sequencing data. Finally, we have shown that Clinical PathoScope outperforms previously published pathogen identification methods with regard to computational speed, sensitivity, and specificity. CONCLUSIONS: Clinical PathoScope is the only pathogen identification method currently available that can identify multiple pathogens from mixed samples and distinguish between very closely related species and strains in samples with very few reads per pathogen. Furthermore, Clinical PathoScope does not rely on genome assembly and thus can more rapidly complete the analysis of a clinical sample when compared with current assembly-based methods. Clinical PathoScope is freely available at: http://sourceforge.net/projects/pathoscope/